Abstract

Recent work by Indurkhya et al. discusses the optimal partitioning of random distributed programs.
They conclude that the optimal partitioning of a homogeneous random program over a homogeneous
distributed system either assigns all modules to a single processor, or distributes the modules
as evenly as possible among all processors. Their analysis rests heavily on the approximation which
equates the expected maximum of a set of independent random variables with the set's maximum
expectation. In this paper we strengthen Indurkhya's results by providing an approximation-free proof
of this result for two processors under general conditions on the module execution time distribution.
We also show that use of this approximation causes two of Indurkhya's central results to be false.
I. Introduction

Indurkhya et al. introduce a random model of distributed programs in [3]. This model supposes 
that a distributed program consists of N modules, each having a random non-negative execution time. 
The modules’ execution times are assumed to be independent and identically distributed. The 
program’s modules are partitioned among P processors; a module will communicate with any other 
given module with probability p. Given that two modules in different processors communicate, the 
delay cost of that communication is random, independent and identically distributed as the cost of any 
other interprocessor communication. Then in [3], the problem of optimally distributing the modules of 
such a program is analyzed under several simplifying assumptions. A number of these assumptions 
concern the measurement of the cost of a partition: the cost function adopted is the sum of the 
expected execution time of the busiest processor with the expected total communication cost. This cost 
function was adopted for tractability reasons; this function does not take into account any time that a 
module must wait for a communication to reach it. More significantly, their analysis assumes that for
independent random variables $X_1, X_2, \ldots, X_n$,

$$E[\max\{X_1, \ldots, X_n\}] = \max\{E[X_1], \ldots, E[X_n]\}.$$

This assumption (which we will call A1) is false; for example, the expected maximum of two independent
identically distributed exponential random variables with mean $\mu$ is $\frac{3}{2}\mu$. There is some error
analysis for this assumption in [3]; however, we will show that this analysis does not apply at a solution
point given by approximation A1. A fuller analysis of the expected maximum statistic is found in
the study of order statistics [1].

The main result in [3] is that when the random program is partitioned for a system of homogene- 
ous processors, the optimal partition has one of two extreme forms. Either the modules are distributed 
as evenly as possible among the processors, or all modules are assigned to the same processor. As this 
conclusion rests on a mathematically incorrect assumption, a natural question is whether this result is 
rigorously true. In this paper we show that for a broad class of module execution time probability
distributions, the result is always true for two processors. We also point out that the error analysis given
in [3] does not apply at a solution point derived in [3], and illustrate by example that the mechanism
given in [3] for determining the optimal two processor partition is flawed as a result of the erroneous
assumption. We provide a counter-example to a P processor theorem in [3], and again show how this
error follows directly from assumption A1.

This paper is organized in the following fashion. Section II introduces the problem's computational
model, illustrates the problems with using assumption A1, and shows that the error analysis in
[3] for this assumption fails at a critical point. Section III treats the optimal partitioning for two
processors, and gives the same result as given in [3]: the optimal partition either assigns all modules to one
processor, or distributes them as evenly as possible. Section IV considers the P processor results given
in [3]. We give a counter-example to Theorem 2 in [3], and show why this theorem fails. The failure
of this theorem invalidates the proof of the main P processor result in [3]. Section V summarizes our
results.

II. Computational Model 

Consider a distributed computer system consisting of P identical processors which communicate 
over some common bus. The program to be distributed consists of N modules; for simplicity we 
assume that N is even. Each module has a random execution time, distributed as a non-negative
random variable R with finite mean r. A module's execution time is assumed to be independent of any
other. In addition, we assume that R is in a certain sense bounded by the exponential random variable
exp(r) with mean r. We assume that exp(r) is stochastically more variable than R, denoted $\exp(r) \geq_v R$
(see [4] for a discussion of this relation). Formally, $\exp(r) \geq_v R$ means that $E[h(\exp(r))] \geq E[h(R)]$ for
all increasing convex functions h; informally, this assumption means that exp(r) has a larger variance
than R. Because the variance of an exponential is quite large, this assumption is not overly restrictive.
We also assume that the family of R's convolutions is monotone in likelihood ratio. Denoting the
j-fold convolution of R by R(j), this assumption means that whenever j > i, then $R(j) \geq_{LR} R(i)$. A
random variable X (with density function f) is said to be larger than random variable Y (with density
function g), in likelihood ratio, denoted $X \geq_{LR} Y$, if

$$\frac{f(x)}{g(x)} \leq \frac{f(y)}{g(y)} \quad \text{whenever } x < y.$$

A discussion of the relation is found in [4]; discrete random variables may be related by $\geq_{LR}$ if
their mass functions satisfy a similar requirement. Common distributions which have monotone
likelihood ratio convolution families are the gamma and Poisson distributions.

Partitioning a random program consists of assigning each module to one of the available processors.
For every i and j ($i \neq j$), module i will communicate with module j with some probability p. If i
communicates with j, but i and j reside in different processors, a random delay cost C is incurred; 
E[C ] = c. This delay cost is assumed to be independent of and identically distributed as every other 
communication delay cost. No communication cost is suffered for communication between co-resident 
modules. 

The execution time for a processor is assumed to be the sum of its resident modules' execution
times. The cost function adopted in [3] adds the mean maximum processor execution time to the
mean total communication cost. The assignment which places k modules in one processor, and N - k
modules in another has a mean execution cost of

$$T_R(k) = E\left[\max\left\{\sum_{i=1}^{k} R_i, \; \sum_{i=k+1}^{N} R_i\right\}\right]$$

where each $R_i$ is an instance of the random variable R. To compute the expected communication cost
for this assignment, we note that k(N - k) communication links are possible, and that a link exists with 
probability p, independent of any other. The mean cost associated with an extant link is c, so that the 
mean total communication delay is given by

$$T_C(k) = E\left[\sum_{i=1}^{k(N-k)} p\,C_i\right] = pck(N-k)$$

where each $C_i$ is an instance of the random variable C. The total cost of this assignment is taken to be
$A(k) = T_R(k) + T_C(k)$. Note that this cost function does not attempt to capture any synchronization
between modules. A fuller explanation of this computational model is given in [3].
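The cost function above can be evaluated exactly in a simple special case. The following is a minimal sketch, assuming (for illustration only) that R is exponential with mean r, so that R(k) is Erlang distributed. It uses the identity max = sum - min, together with the Erlang survival function, which reduces E[min] to a finite sum. The function names are hypothetical and do not come from [3].

```python
from fractions import Fraction
from math import comb

def expected_min_erlang(a, b):
    """E[min{R(a), R(b)}] for unit-mean exponential modules (exact).

    Obtained by integrating the product of the two Erlang survival functions.
    """
    return sum(Fraction(comb(i + j, i), 2 ** (i + j + 1))
               for i in range(a) for j in range(b))

def t_r(k, n, r=1):
    """T_R(k) = E[max{R(k), R(N-k)}] = r * (N - E[min]) for exponential R."""
    return r * (n - expected_min_erlang(k, n - k))

def t_c(k, n, pc):
    """T_C(k) = p c k (N - k)."""
    return pc * k * (n - k)

def cost(k, n, pc, r=1):
    """A(k) = T_R(k) + T_C(k)."""
    return t_r(k, n, r) + t_c(k, n, pc)

if __name__ == "__main__":
    n, pc = 4, Fraction(1, 10)
    for k in range(n // 2, n + 1):
        print(k, t_r(k, n), t_c(k, n, pc), cost(k, n, pc))
```

Exact rational arithmetic is used so that small differences between neighbouring values of A(k) are not obscured by rounding.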

Following these definitions, it is assumed in [3] that $T_R(k)$ is given by

$$T_R(k) = \max\left\{E\left[\sum_{i=1}^{k} R_i\right], \; E\left[\sum_{i=k+1}^{N} R_i\right]\right\}$$

which is equivalent to

$$T_R(k) = kr$$

when $k \geq N/2$. This is a reasonable assumption when N is large and k is close to N; it can otherwise
be a poor approximation. Furthermore, approximation A1's error is accentuated by the number of
random variables involved. For example, the expected maximum of n independent identically distributed
exponential random variables with mean r is given by

$$r \sum_{i=1}^{n} \frac{1}{i}.$$
In each case, assumption A1 approximates this mean with r. In fact, Jensen's Inequality [3] states that
for any independent random variables $X_1, \ldots, X_n$, and convex function g,

$$E[g(X_1, \ldots, X_n)] \geq g(E[X_1], \ldots, E[X_n]).$$

Because the max function is convex, assumption A1 gives a lower bound on the true expectation.
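To make the size of the gap concrete, the sketch below compares the exact expected maximum of n independent exponentials with the value that A1 assigns. It relies on the standard harmonic-sum formula for the expected maximum of i.i.d. exponentials; the function names are illustrative only.

```python
from fractions import Fraction

def expected_max_exponential(n, r=1):
    """Exact E[max] of n i.i.d. exponentials with mean r: r * (1 + 1/2 + ... + 1/n)."""
    return r * sum(Fraction(1, i) for i in range(1, n + 1))

def a1_estimate(n, r=1):
    """Approximation A1: the maximum of the n (equal) expectations, which is r."""
    return r

if __name__ == "__main__":
    for n in (2, 5, 10, 100):
        exact = expected_max_exponential(n)
        # The ratio shows how far A1 underestimates the true expectation.
        print(n, float(exact), float(exact / a1_estimate(n)))
```

The ratio grows like the natural logarithm of n, which is consistent with the remark that the error of A1 is accentuated by the number of random variables involved.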

Since there is a notable discrepancy between A1 and the expected maximum of a group of 
independent and identical exponentials, it is instructive to investigate the differences between this 
example and the error analysis for A1 provided in [3]. First, our example considered exponentials, whereas the
error analysis considered normals. Secondly, the error analysis in [3] is asymptotic, applying when the
number of modules becomes large. However, neither of these considerations is important when
compared to the fact that the error analysis in [3] does not apply when modules are evenly distributed, as
assumed in our example. In [3], inequality (27) cited from [2] bounds the probability that a normal
random variable $R_2$ is greater than a normal random variable $R_1$, in terms of the mean and variance of
$R_1 - R_2$. The inequality cited from [2] applies only if $E[R_1]$ is strictly greater than $E[R_2]$, a fact
overlooked in [3]. Note that if $E[R_1] = E[R_2]$, and $\mathrm{var}(R_1) = \mathrm{var}(R_2)$, then the probability that $R_2$ exceeds
$R_1$ is 1/2, regardless of the values for the means and variances. But this corresponds to the even
distribution of modules across the processors, one of the solution points of the distribution problem under
assumption A1. Our analysis avoids assumption A1 by considering analytical properties of the
assignment cost function A(k).

We will focus on the convex and concave nature of certain functions. A function g is convex if
for every X and Y in its domain,

$$g(\lambda X + (1 - \lambda)Y) \leq \lambda g(X) + (1 - \lambda) g(Y) \quad \text{for all } \lambda \in [0,1].$$

g is concave if this inequality is reversed. In our analysis, g's domain is usually the non-negative
integers I. In this case, g is convex if

$$g(i+1) - g(i) \geq g(i) - g(i-1) \quad \text{for all } i \in I$$

and g is concave if this inequality is reversed. We next apply these definitions to the distribution
problem with two processors.

III. Optimal Partitioning for Two Processors 

Consider the partitioning of a random program for a two processor system. We will show that
the assignment cost function A(k) has no local minimum for integer $k \in [N/2, N]$. This directly implies
that the partition minimizing A(k) either distributes the modules equally between the two processors, or
places all modules on one processor. This result is derived by establishing convexity and concavity
properties of $T_R(k)$ and $T_C(k)$. To simplify our notation, we let R(k) denote the k-fold convolution of
the random variable R. Then $T_R(k)$ is given by

$$T_R(k) = E[\max\{R(k), R(N-k)\}].$$

Unless otherwise stated, all random variables we discuss are assumed to be independent.

The key results for this problem are that $T_R(k) - T_R(k-1)$ is a concave function of k, and that
$T_R(k)$ is a convex function of k. The proof of this claim is detailed, and is found in Appendix A.

THEOREM 1:

• $T_R(k)$ is convex in k.

• For $N/2 < k \leq N$, $T_R(k) - T_R(k-1)$ is increasing and concave in k.

□

The convexity of $T_R(k)$ is illustrated by Figure 1, where N = 20 and R is an exponential with
mean 1. Figure 1 also illustrates the lower bound given by Jensen's Inequality. Figure 2 illustrates the
concavity of $T_R(k) - T_R(k-1)$ under these same assumptions.
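The two properties stated in Theorem 1 can be checked numerically for the illustrated case (N = 20, exponential R with mean 1). The sketch below is only a sanity check under that assumption, not a substitute for the proof in Appendix A; it computes $T_R(k)$ exactly through the identity max = sum - min, and the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

def t_r(k, n):
    """Exact T_R(k) = E[max{R(k), R(N-k)}] for unit-mean exponential R."""
    e_min = sum(Fraction(comb(i + j, i), 2 ** (i + j + 1))
                for i in range(k) for j in range(n - k))
    return n - e_min

N = 20
T = [t_r(k, N) for k in range(N + 1)]
# First differences T_R(k) - T_R(k-1), for k = 1..N.
delta = {k: T[k] - T[k - 1] for k in range(1, N + 1)}
# Second differences T_R(k+1) - 2 T_R(k) + T_R(k-1), for k = 1..N-1.
second = {k: delta[k + 1] - delta[k] for k in range(1, N)}

convex = all(value >= 0 for value in second.values())
increasing = all(delta[k + 1] >= delta[k] for k in range(N // 2 + 1, N))
concave_diffs = all(second[k + 1] <= second[k] for k in range(N // 2 + 1, N - 1))

if __name__ == "__main__":
    print(convex, increasing, concave_diffs)
```

Non-negative second differences correspond to convexity of $T_R(k)$, and second differences that shrink as k grows past N/2 correspond to concavity of the first differences.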

To help show that A(k) has no local minimum over [N/2, N], we define

$$\delta(k) = T_R(k) - T_R(k-1)$$

and

$$\varepsilon(k) = T_C(k) - T_C(k-1) = -pc(2k - N - 1)$$

over the interval [N/2+1, N]. Note that Theorem 1 states that $\delta(k)$ is increasing and concave. For k > N/2
we may write

$$T_R(k) = T_R(N/2) + \sum_{j=N/2+1}^{k} \delta(j)$$

and

$$T_C(k) = T_C(N/2) + \sum_{j=N/2+1}^{k} \varepsilon(j).$$

The idea now is to use the functions $\delta(k)$ and $\varepsilon(k)$ to show, (1) if A(k) decreases between k = N/2 and
k = N/2+1, then it decreases over its entire domain, and (2) if A(k) increases between k = N/2 and
k = N/2+1, then there exists at most one point k where A(k) "turns" in direction by changing from
increasing (decreasing) to decreasing (increasing). If (1) applies, then there is clearly no local
minimum for the objective function. If (2) applies, then the objective function initially increases, then
potentially decreases, but cannot turn from decreasing to increasing. This too clearly implies that no
local minimum exists.

We may consider $\delta$ and $\varepsilon$ to be continuous functions formed by taking the linear interpolation
between their discretely defined values. If the objective function decreases between k-1 and k, then

$$T_R(k) + T_C(k) < T_R(k-1) + T_C(k-1)$$

$$\iff T_R(k) - T_R(k-1) < -[T_C(k) - T_C(k-1)]$$

$$\iff \delta(k) < |\varepsilon(k)|.$$

An immediate implication of this observation is that we can find points at which A(k) turns in direction
by finding points where the functional curves of $\delta(k)$ and $|\varepsilon(k)|$ intersect. Theorem 1 states that $\delta(k)$ is
concave in k; furthermore, $\varepsilon(k)$ is linear in k. We suppose first that $\delta$ exceeds $|\varepsilon|$ at the leftmost domain
point k = N/2+1. $|\varepsilon(N/2+1)| < \delta(N/2+1)$ occurs if A(k) increases between k = N/2 and k = N/2+1. Both
$|\varepsilon(k)|$ and $\delta(k)$ are increasing in k; since one is linear and the other concave, it is not possible for their
functional curves to intersect more than once, as illustrated by Figure 3. If the functional curves for
$\delta(k)$ and $|\varepsilon(k)|$ do not intersect, then A(k) increases over its entire domain. If they do intersect, A(k)
initially increases, and then decreases. No local minimum is achieved in either case.

We next suppose that $\delta(N/2+1) < |\varepsilon(N/2+1)|$. A general linear function which exceeds $\delta(k)$ at
k = N/2+1 could intersect $\delta(k)$ twice; however, we show that $\delta(k) < |\varepsilon(k)|$ for all k > N/2, so that A(k)
is strictly decreasing over its domain. This is established by showing that the slope of $|\varepsilon(k)|$ is greater
than the slope of the segment of $\delta(k)$ between k = N/2+1 and k = N/2+2. Since $\delta(k)$ is concave, the
slopes of its segments are decreasing in k; it will follow that $|\varepsilon(k)|$ never intersects $\delta(k)$. Now the slope
of $|\varepsilon(k)|$ for k > N/2 is seen to be $2|\varepsilon(N/2+1)|$. We therefore wish to establish that

$$2|\varepsilon(N/2+1)| > \delta(N/2+2) - \delta(N/2+1).$$

Since $|\varepsilon(N/2+1)| > \delta(N/2+1)$ by assumption, it will suffice to show that

$$2\delta(N/2+1) > \delta(N/2+2) - \delta(N/2+1).$$

Simple algebra shows that this latter inequality is equivalent to

$$T_R(N/2+1) > \frac{3}{4} T_R(N/2) + \frac{1}{4} T_R(N/2+2). \qquad (5)$$

Because of its length, the proof of inequality (5) is given in Appendix B. The veracity of inequality (5)
implies that $\delta(k) < |\varepsilon(k)|$ for all $k \geq N/2+1$, so that A(k) is decreasing over its entire domain. We have
thus established Theorem 2.

THEOREM 2 : A(k) has no local minimum over [N/2, N] and is therefore minimized at either 
k = N/2, or k = N. 

□ 

Figure 4 illustrates the behavior of A(k) when N = 20, R is exponential with mean 1, and pc = 0.1. 
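The behavior described by Theorem 2 can be reproduced for the case of Figure 4. The following sketch assumes exponential R with mean 1 and searches all k in [N/2, N] for the minimizer of A(k); the helper names are hypothetical. It is a numerical illustration rather than evidence for the general theorem.

```python
from fractions import Fraction
from math import comb

def t_r(k, n):
    """Exact T_R(k) = E[max{R(k), R(N-k)}] for unit-mean exponential R."""
    e_min = sum(Fraction(comb(i + j, i), 2 ** (i + j + 1))
                for i in range(k) for j in range(n - k))
    return n - e_min

def best_split(n, pc):
    """Return the k in [N/2, N] that minimizes A(k) = T_R(k) + pc * k * (N - k)."""
    costs = {k: t_r(k, n) + pc * k * (n - k) for k in range(n // 2, n + 1)}
    return min(costs, key=costs.get)

if __name__ == "__main__":
    N = 20
    for pc in (Fraction(1, 1000), Fraction(1, 100), Fraction(1, 20), Fraction(1, 10)):
        # Each reported minimizer is expected to be an endpoint, N/2 or N.
        print(float(pc), best_split(N, pc))
```

With pc = 0.1 the communication term dominates and the search returns k = N; with a very small pc the execution term dominates and it returns k = N/2.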
Theorem 2 shows that to determine the optimal partitioning, we compare the costs of two partitions.
If N is even, the optimal partitioning will place all modules on a single processor if

$$Nr < E[\max\{R(N/2), R(N/2)\}] + pc\left(\frac{N}{2}\right)^2. \qquad (6)$$

This expression is easily modified if N is odd. Under assumption A1, a derivation in [3] shows that
the optimal partition places all modules on one processor if and only if

$$N/2 > r/(pc). \qquad (6.A1)$$

This statement is significant in that it says we need only know the number of modules, mean module
execution cost, and mean inter-module communication cost to determine the optimal two processor
partition. However, this claim is not true in practice. For example, consider the simple case where N = 2,
and r = c = 1. For any positive p < 1, we have N/2 = 1 < 1/p = r/(pc), so that according to (6.A1) the
optimal partition distributes the two modules. However, if the modules have exponential execution
times, the expected maximum execution time is 3/2. Inequality (6) is then satisfied for any p > 1/2,
when the optimal partition places both modules in the same processor. Thus we see that the
determination of the optimal partition depends in part on the variance of R, not simply the mean; approximation
A1 leads to analysis which is insensitive to variation in module execution times.
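The disagreement between the two criteria in this N = 2 example can be written out directly. The sketch below assumes exponential modules, for which the expected maximum of two modules is 3r/2; the function names are illustrative and not taken from [3].

```python
def single_processor_exact(p, r=1.0, c=1.0):
    """Criterion (6) with N = 2 and exponential modules: 2r < (3/2) r + p c."""
    n = 2
    expected_max = 1.5 * r  # E[max] of two i.i.d. exponentials with mean r
    return n * r < expected_max + p * c * (n / 2) ** 2

def single_processor_a1(p, r=1.0, c=1.0):
    """Criterion (6.A1): N/2 > r / (p c)."""
    n = 2
    return n / 2 > r / (p * c)

if __name__ == "__main__":
    for p in (0.25, 0.75):
        print(p, single_processor_exact(p), single_processor_a1(p))
```

For p = 0.75 the exact criterion places both modules on one processor while (6.A1) distributes them, which is the discrepancy described above.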

IV. P Processor Results 

Approximation A1 is used by [3] to derive results concerning partitions for P processors. In this
section we point out how A1 leads to theorems given in [3] which do not hold unless R is constant.

Theorem 2 in [3] characterizes the optimal partitioning under the constraint that the heaviest
loaded processor has exactly m modules. This theorem provides us with a powerful tool for
determining the optimal partitioning of any P processor problem in O(N - N/P) time; we need only consider all
possible loads on the heaviest loaded processor. However, we will show that Theorem 2 cannot be
trusted when the module execution times are random. We both give a counter-example to this
theorem, and illustrate a range of parameter values for which this theorem fails to hold.

One useful derivation given in [3] shows that the mean communication cost of the assignment
which, for j = 1, 2, ..., P, places $k_j$ modules on processor j is

$$T_C(N, k_1, \ldots, k_P) = \frac{1}{2} pc \left[N^2 - k_1^2 - k_2^2 - \cdots - k_P^2\right]. \qquad (7)$$

We will appeal to this equation when we discuss communication costs. We now paraphrase Theorem
2, and then give a counter-example to its statement.


Theorem 2: Under the constraint that a definite number of modules, say m, are to be assigned
to a processor, and no other processor is to be assigned more than m modules, the optimal
assignment is defined as follows. Let l be the largest integer such that $ml \leq N$. Exactly
l processors will have m modules, and the remaining N - ml modules are assigned to one
other processor.

□ 


Consider the assignment of four independent exponential random variables R with mean 1 to four 
processors. According to the statement above, the cost of assigning two modules to two processors 
(called the 2-2 assignment) is less than the cost of assigning two modules to one processor, and one 
each to two other processors (called the 2-1-1 assignment). The expected maximum execution costs for 
this example can be derived analytically. We first consider the execution cost of the 2-2 assignment: 


$$M_{22} = E[\max\{R(2), R(2)\}] = \int_0^\infty \left[1 - \mathrm{Prob}\{R(2) \leq t\}^2\right] dt = \int_0^\infty \left[1 - (1 - e^{-t} - te^{-t})^2\right] dt = 2.75$$

where the last step results from expanding the squared term and integrating each piece separately. The
execution cost $M_{211}$ of the 2-1-1 assignment is found in a similar fashion, and is $2\frac{4}{9}$. According to
(7) above, the communication cost for the 2-2 assignment is 4pc, and the communication cost for the
2-1-1 assignment is 5pc. To counter Theorem 2 we need to find a cost pc such that

$$M_{22} + 4pc > M_{211} + 5pc.$$

Substituting the numerical values for $M_{22}$ and $M_{211}$, we see that this is equivalent to determining pc
such that

$$M_{22} - M_{211} = 0.31 > 5pc - 4pc = pc.$$
This counter-example highlights the cause of failure in Theorem 2. Its proof in [3] depends on
the assumption that the mean maximum execution time does not change if a load balance is performed
between lightly loaded processors (a result which follows directly from approximation A1). We found
an example where the mean maximum execution time does change, and were then able to construct a
counter-example. Furthermore, for any value of r, it is possible to find values of pc for which
Theorem 2 fails to hold. In fact, it is not difficult to prove the following lemma:

LEMMA 3: Suppose $k_1 > k_2 \geq k_3 \geq \cdots \geq k_P$. Then

$$E[\max\{R(k_1), R(k_2), \ldots, R(k_P)\}] \geq E[\max\{R(k_1 - 1), R(k_2 + 1), R(k_3), \ldots, R(k_P)\}].$$

□

Lemma 3 shows that moving a module to better balance the assignment cannot increase the
expected maximum execution time; furthermore, if R is unbounded (like an exponential) this inequality
will be strict. It is shown in [3] that by balancing as described in Lemma 3, the communication cost
increases by $pc(k_1 - k_2 - 1)$, in agreement with equation (7). Lemma 3 says that the execution cost
decreases by balancing. It is possible then to choose a value of pc so that the increase in communication
cost is less than the decrease in execution cost.

The central P processor result in [3] (given as Theorem 3) states that under the constraint that all 
utilized processors have the same number of modules, the optimal partition is extremal. However, the 
proof of this result rests both on Theorem 2 and approximation A1. We have empirically tested this
result using a wide range of values for N, P, and various different distributions for R. All of our tests
substantiated Theorem 3's conclusion. Clearly, a more rigorous proof of this result is called for.

V. Summary 

Indurkhya et al. in [3] consider the interesting problem of distributing program modules whose
execution and communication behavior are characterized probabilistically. They conclude that the
optimal assignment is extremal: either all modules are placed on one processor, or the modules are
distributed as evenly as possible. Their analysis rests on an approximation which can be quite inaccurate.
We have strengthened their work by showing that for a general class of module execution time
distributions, it is possible to derive this conclusion in the case of two processors without employing this
approximation. However, we also show that two significant conclusions drawn in [3] are false because
of the approximation. One conclusion characterizes the optimal two processor partition; the other
characterizes the optimal P processor partition under a particular constraint, and implies that the
optimal partition for a general problem can be determined in O(N - N/P) time. Furthermore, this
conclusion is central to the proof of their P processor optimal partition extremity result. While empirical
studies suggest that the optimal P processor partition is also extremal, further work is needed to
rigorously establish this result.
Appendix A

In this appendix we prove Theorem 1. To show that $T_R(k)$ is convex, and that $T_R(k+1) - T_R(k)$ is
concave, we will show that the function

$$\Delta(k) = [T_R(k+1) - T_R(k)] - [T_R(k) - T_R(k-1)]$$

$$= T_R(k+1) - 2T_R(k) + T_R(k-1)$$

is non-negative, and decreasing. Observe that $\Delta(k)$ is twice the difference between the linear
interpolation at k between endpoints $T_R(k-1)$, $T_R(k+1)$, and the value $T_R(k)$. As such, $\Delta(k)$ measures the
convexity of the function by its deviation from a linear function. Let s, t, u, and v be non-negative real
numbers, and define

$$D(s,t,u,v) = \max\{s+u+v, \; t\} - 2\max\{s+u, \; t+v\} + \max\{s, \; t+u+v\}$$

so that

$$\Delta(k) = E[D(R(k-1), R(N-k-1), R_1, R_2)]$$

where $R_1$ and $R_2$ are independent instances of R, and the expectation is taken with respect to the joint
distribution of all random variables referenced. We demonstrate the desired properties of $\Delta(k)$ by first
conditioning on the values of $R_1$ and $R_2$. Let u and v be fixed; straightforward algebra shows that the
value of D(s,t,u,v) then depends only on the relationship of s-t to u and v. To emphasize this fact, we
change our notation for D to D(s-t,u,v), and note that $\Delta(k) = E[D(R(k-1) - R(N-k-1), R_1, R_2)]$. For
fixed u and v, D(s-t,u,v) is a piece-wise linear function described by the following four cases.

Case $s - t < -(u + v)$: $D(s,t,u,v) = u - v$;

Case $-(u + v) \leq s - t < v - u$: $D(s,t,u,v) = s - t + 2u$;

Case $v - u \leq s - t < u + v$: $D(s,t,u,v) = t - s + 2v$;

Case $u + v \leq s - t$: $D(s,t,u,v) = v - u$.
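The four cases can be checked mechanically against the definition of D. The sketch below compares the piece-wise formula with the defining expression on a small integer grid, and also confirms that the symmetrized sum D(s-t,u,v) + D(s-t,v,u) is non-negative on that grid. Integer inputs are an assumption made here to avoid floating-point comparisons; the function names are illustrative.

```python
from itertools import product

def d_direct(s, t, u, v):
    """D(s,t,u,v) computed from its definition."""
    return max(s + u + v, t) - 2 * max(s + u, t + v) + max(s, t + u + v)

def d_cases(s, t, u, v):
    """D(s,t,u,v) computed from the four-case piece-wise linear description."""
    x = s - t
    if x < -(u + v):
        return u - v
    if x < v - u:
        return x + 2 * u
    if x < u + v:
        return -x + 2 * v
    return v - u

def grid_agrees(limit=6):
    """True if both forms agree for all integer s, t, u, v in [0, limit)."""
    return all(d_direct(s, t, u, v) == d_cases(s, t, u, v)
               for s, t, u, v in product(range(limit), repeat=4))

def symmetrized_nonnegative(limit=6):
    """True if D(s,t,u,v) + D(s,t,v,u) >= 0 on the same grid."""
    return all(d_direct(s, t, u, v) + d_direct(s, t, v, u) >= 0
               for s, t, u, v in product(range(limit), repeat=4))

if __name__ == "__main__":
    print(grid_agrees(), symmetrized_nonnegative())
```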

Figure 5 illustrates the behavior of D(s-t,u,v) for both the case where u > v, and the case where u < v.
Figure 6 then illustrates the behavior of D(s-t,u,v) + D(s-t,v,u). We observe that this sum is always
non-negative, is symmetric about 0, and is decreasing for s-t > 0. Let p(u,v) be the probability density
(or mass, if R is discrete) function for the joint distribution of $R_1$ and $R_2$. Since the event that $R_1 = u$
and $R_2 = v$ has the same probability density as the event that $R_1 = v$ and $R_2 = u$, it follows that
p(u,v) = p(v,u) for all u and v. Thus

$$E[D(s-t, R_1, R_2)] = \int_0^\infty \int_0^\infty \left[D(s-t,u,v) + D(s-t,v,u)\right] \frac{1}{2} p(u,v) \, dv \, du,$$

an expression easily modified if R is discrete. As a function of s-t, $E[D(s-t, R_1, R_2)]$ is also
non-negative, symmetric about 0, and decreasing for s-t > 0, since it is a positively weighted sum of
functions which have these properties.

$\Delta(k)$ is the expected value of the function $E[D(s-t, R_1, R_2)]$ with respect to the random variable
R(k-1) - R(N-k-1). Clearly then, $\Delta(k)$ is always non-negative, so that $T_R(k)$ is convex. To show that
$T_R(k) - T_R(k-1)$ is non-negative, we cite Lemma 3. To show that $T_R(k) - T_R(k-1)$ is concave, we
argue that $\Delta(k)$ is decreasing in k. Letting $f_k(x)$ denote the density function for R(k-1) - R(N-k-1), we
observe by symmetry that

$$\Delta(k) = \int_{-\infty}^{\infty} E[D(x, R_1, R_2)] f_k(x) \, dx$$

$$= \int_0^{\infty} E[D(x, R_1, R_2)] \left[f_k(x) + f_k(-x)\right] dx$$

$$= E[D(|R(k-1) - R(N-k-1)|, R_1, R_2)].$$

We will now show that $|R(k-1) - R(N-k-1)|$ is stochastically larger than $|R(k-2) - R(N-k)|$, that is,

$$\mathrm{Prob}\{|R(k-1) - R(N-k-1)| > t\} \geq \mathrm{Prob}\{|R(k-2) - R(N-k)| > t\} \quad \text{for all } t > 0.$$

Let Z = R(k-2) - R(N-k-1), and let $f_Z(x)$ be its density function. Then the inequality above is
equivalent to

$$\mathrm{Prob}\{|Z + R| > t\} \geq \mathrm{Prob}\{|Z - R| > t\} \quad \text{for all } t > 0. \qquad (8)$$

Recall that we have assumed that R's family of convolutions is monotone in likelihood ratio; in
particular, that $R(k-2) \geq_{LR} R(N-k-1)$. This relation implies that whenever x > 0, then $f_Z(-x) \leq f_Z(x)$.
Using this fact, it is straightforward to show that

$$\mathrm{Prob}\{|Z + r| > t\} \geq \mathrm{Prob}\{|Z - r| > t\} \quad \text{for all } t, r > 0,$$

which implies inequality (8). Thus $|Z + R|$ is stochastically larger than $|Z - R|$. In [4] it is shown that a
random variable X is stochastically larger than random variable Y if and only if $E[g(X)] \geq E[g(Y)]$ for
all increasing functions g (equivalently, that $E[g(X)] \leq E[g(Y)]$ for all decreasing functions g). This
immediately implies that

$$\Delta(k) = E[D(|Z + R|, R_1, R_2)]$$

$$\leq E[D(|Z - R|, R_1, R_2)]$$

$$= \Delta(k-1).$$

Since $\Delta(k)$ decreases in k, it follows that $T_R(k) - T_R(k-1)$ is concave in k.

Appendix B

This appendix shows that under our assumptions about the random variable R, it is true that

$$T_R(N/2+1) > \frac{3}{4} T_R(N/2) + \frac{1}{4} T_R(N/2+2).$$

We will first establish this result for the smallest N for which this result applies, N = 4. In this case,
we must show that

$$E[\max\{R(3), R(1)\}] > \frac{3}{4} E[\max\{R(2), R(2)\}] + \frac{1}{4} \cdot 4r,$$

or,

$$E[\max\{R(3), R(1)\}] - \frac{3}{4} E[\max\{R(2), R(2)\}] > r. \qquad (9)$$

If $\tilde{R}$ is a random variable with a larger variance but the same mean as R, then we can expect that

$$E[\max\{\tilde{R}(k), \tilde{R}(j)\}] \geq E[\max\{R(k), R(j)\}]$$

for any integer k and j. This inequality is formally derived in the event that $\tilde{R}$ is stochastically more
variable than R, or $\tilde{R} \geq_v R$ (the theory of stochastic variability is treated in [4]). Recall that we have
assumed that $\exp(r) \geq_v R$, where exp(r) is the exponential random variable with mean r. Now,
inequality (9) holds when R is exponential, and it holds when R is constant. The left hand side of (9) is larger
given constant R (1.5r) than it is with exponential R (1.0625r). We see then that (9) is true for any R
dominated by the exponential: the term $E[\max\{R(2), R(2)\}]$ is more sensitive to increasing variance in
R than is $E[\max\{R(3), R(1)\}]$; this fact explains why the left side of (9) is smaller for exponential R than
it is for constant R.
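The two values quoted for the left hand side of (9) can be reproduced exactly. The sketch below takes r = 1 and evaluates the expression for exponential R, using max = sum - min with Erlang survival functions, and for constant R, where R(k) = kr. The function names are illustrative only.

```python
from fractions import Fraction
from math import comb

def e_max_exponential(a, b):
    """E[max{R(a), R(b)}] for unit-mean exponential R, via max = sum - min."""
    e_min = sum(Fraction(comb(i + j, i), 2 ** (i + j + 1))
                for i in range(a) for j in range(b))
    return a + b - e_min

def lhs_9_exponential():
    """Left hand side of (9) for exponential R with r = 1."""
    return e_max_exponential(3, 1) - Fraction(3, 4) * e_max_exponential(2, 2)

def lhs_9_constant(r=1):
    """Left hand side of (9) for constant R: the maxima are 3r and 2r."""
    return 3 * r - Fraction(3, 4) * 2 * r

if __name__ == "__main__":
    print(float(lhs_9_exponential()), float(lhs_9_constant()))
```

Both values exceed r = 1, and the exponential case gives the smaller of the two, as stated above.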

We now argue that

$$T_R(N/2+1) > \frac{3}{4} T_R(N/2) + \frac{1}{4} T_R(N/2+2)$$

for general (even) N > 4. This argument is aided by Figure 5 which depicts the inequality. We see
that the difference

$$T_R(N/2+1) - \frac{3}{4} T_R(N/2) - \frac{1}{4} T_R(N/2+2) \qquad (10)$$

is an (inverse) measure of convexity, measured as the deviation of $T_R(N/2+1)$ from the linear
interpolation between $T_R(N/2)$ and $T_R(N/2+2)$. But as we increase N, that convexity will decrease. This is easily
seen by referring again to Figure 5 which depicts the general properties of the function
$E[D(s-t, R_1, R_2)]$ described in Appendix A. As N increases, the variance of R(N/2) - R(N/2) increases,
placing more probability weight on the tails of the distribution. The effect of this on
$E[D(R(N/2) - R(N/2), R_1, R_2)]$ is to decrease its value. Thus the convexity decreases in N, so that the
value of expression (10) will increase. Since this expression is positive when N = 4, and increases in
N, it is positive in general, which establishes inequality (5).
Fig. 2: Concavity of $T_R(k) - T_R(k-1)$
Fig. 3: $|\varepsilon(k)|$ and $\delta(k)$ Intersect Once
Fig. 7: Convexity Measure